1. Import necessary libraries
2. Data collection
    • District code collection
    • Combine 4 years of CSV data into one dataframe
3. Data cleansing
4. Data analysis via data visualization
5. Findings from the data visualization
6. Getting venue data via the Foursquare API
7. K-means clustering model on the different districts

Import necessary libraries

In [1]:
"""
!pip install wheel
!pip install pipwin
!pipwin install numpy
!pipwin install pandas
!pipwin install shapely
!pipwin install gdal
!pipwin install fiona
!pipwin install pyproj
!pipwin install six
!pipwin install rtree
!pipwin install geopandas
!pip install geocoder
!pip3 install folium
!pip3 install beautifulsoup4
!pip3 install seaborn 
!pip install missingno
"""
import pandas as pd
from pandas.api.types import CategoricalDtype
import requests
import geocoder
import folium
from bs4 import BeautifulSoup
from matplotlib import pyplot as plt
import seaborn as sns
import missingno as msno
import numpy as np
from pylab import rcParams
import geopandas as gpd

Import Data

In [2]:
data_2015 = pd.read_csv('./data/crime-incident-reports-2015.csv')
data_2016 = pd.read_csv('./data/crime-incident-reports-2016.csv')
data_2017 = pd.read_csv('./data/crime-incident-reports-2017.csv')
data_2018 = pd.read_csv('./data/crime-incident-reports-2018.csv')
district = pd.read_csv('./data/district.csv')

Checking their shape and total row count

In [3]:
print('2015: ',data_2015.shape)
print('2016: ',data_2016.shape)
print('2017: ',data_2017.shape)
print('2018: ',data_2018.shape)
total_row = data_2015.shape[0] + data_2016.shape[0] + data_2017.shape[0] + data_2018.shape[0] 
print(total_row)
2015:  (53597, 17)
2016:  (99430, 17)
2017:  (101338, 17)
2018:  (98888, 17)
353253

Concatenate the four years of data into one dataframe

In [4]:
data = pd.concat([data_2015,data_2016,data_2017,data_2018])
data.shape
Out[4]:
(353253, 17)
In [5]:
data.head(5)
Out[5]:
INCIDENT_NUMBER OFFENSE_CODE OFFENSE_CODE_GROUP OFFENSE_DESCRIPTION DISTRICT REPORTING_AREA SHOOTING OCCURRED_ON_DATE YEAR MONTH DAY_OF_WEEK HOUR UCR_PART STREET Lat Long Location
0 I192068249 2647 Other THREATS TO DO BODILY HARM B2 280 NaN 2015-08-28 10:20:00 2015 8 Friday 10 Part Two WASHINGTON ST 42.330119 -71.084251 (42.33011862, -71.08425106)
1 I192061894 1106 Confidence Games FRAUD - CREDIT CARD / ATM FRAUD C11 356 NaN 2015-08-20 00:00:00 2015 8 Thursday 0 Part Two CHARLES ST 42.300605 -71.061268 (42.30060543, -71.06126785)
2 I192038828 1107 Fraud FRAUD - IMPERSONATION A1 172 NaN 2015-11-02 12:24:00 2015 11 Monday 12 Part Two ALBANY ST 42.334288 -71.072395 (42.33428841, -71.07239518)
3 I192008877 1107 Fraud FRAUD - IMPERSONATION E18 525 NaN 2015-07-31 10:00:00 2015 7 Friday 10 Part Two WINGATE RD 42.237009 -71.129566 (42.23700950, -71.12956606)
4 I182090828 1102 Fraud FRAUD - FALSE PRETENSE / SCHEME D4 159 NaN 2015-12-01 12:00:00 2015 12 Tuesday 12 Part Two UPTON ST 42.342432 -71.072258 (42.34243222, -71.07225766)

Converting the district table to a dictionary

In [6]:
district = district.set_index("DISTRICT")
#district.head()
In [7]:
dict_district = district.to_dict()
dict_district = dict_district['DISTRICT_NAME']
dict_district
Out[7]:
{'A1': 'DOWNTOWN',
 'A15': 'CHARLESTOWN',
 'A7': 'EAST BOSTON',
 'B2': 'BOXBURY',
 'B3': 'MATTAPAN',
 'C6': 'SOUTH BOSTON',
 'C11': 'DORCHESTER',
 'D4': 'SOUTH END',
 'D14': 'BRIGHTON',
 'E5': 'WEST BOXBURY',
 'E13': 'JAMAICA PLAIN',
 'E18': 'HYDE PARK'}

Dropping rows where the district is NaN or External

In [8]:
data = data.drop(data[data.DISTRICT=='External'].index)
data = data[data.DISTRICT.notna()]
data.shape
Out[8]:
(351426, 17)
In [9]:
data['DISTRICT'].unique()
#data['District_name'] = district_name
#data.head(15)
Out[9]:
array(['B2', 'C11', 'A1', 'E18', 'D4', 'B3', 'C6', 'D14', 'A7', 'E5',
       'E13', 'A15'], dtype=object)
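
One caveat about the steps above: `pd.concat` keeps each yearly file's own 0-based index unless `ignore_index=True` is passed, so index labels repeat across years, and `DataFrame.drop` with a list of labels removes every row that carries one of those labels. The sketch below uses a small made-up frame (not the crime data) to contrast dropping by label with filtering by a boolean mask.

```python
import pandas as pd

# Two tiny hypothetical "yearly" frames, each with its own 0-based index
y1 = pd.DataFrame({"DISTRICT": ["A1", "External", "B2"]})
y2 = pd.DataFrame({"DISTRICT": ["C6", "D4", None]})

combined = pd.concat([y1, y2])  # index labels are 0, 1, 2, 0, 1, 2

# Dropping by label also removes the unrelated row labelled 1 from y2 ("D4")
by_label = combined.drop(combined[combined.DISTRICT == "External"].index)

# A boolean mask removes only the rows that actually match
mask = combined["DISTRICT"].notna() & (combined["DISTRICT"] != "External")
by_mask = combined[mask].reset_index(drop=True)

print(by_label["DISTRICT"].tolist())
print(by_mask["DISTRICT"].tolist())
```

Passing `ignore_index=True` to `pd.concat` would give every row a unique label and avoid the issue in the first place.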

Adding district name into data

In [10]:
district_name=[]
for i in data['DISTRICT']:
    for j in dict_district:
        if (i ==j):
            district_name.append(dict_district[j])
    
In [11]:
data['District_name'] = district_name
data.head()
Out[11]:
INCIDENT_NUMBER OFFENSE_CODE OFFENSE_CODE_GROUP OFFENSE_DESCRIPTION DISTRICT REPORTING_AREA SHOOTING OCCURRED_ON_DATE YEAR MONTH DAY_OF_WEEK HOUR UCR_PART STREET Lat Long Location District_name
0 I192068249 2647 Other THREATS TO DO BODILY HARM B2 280 NaN 2015-08-28 10:20:00 2015 8 Friday 10 Part Two WASHINGTON ST 42.330119 -71.084251 (42.33011862, -71.08425106) BOXBURY
1 I192061894 1106 Confidence Games FRAUD - CREDIT CARD / ATM FRAUD C11 356 NaN 2015-08-20 00:00:00 2015 8 Thursday 0 Part Two CHARLES ST 42.300605 -71.061268 (42.30060543, -71.06126785) DORCHESTER
2 I192038828 1107 Fraud FRAUD - IMPERSONATION A1 172 NaN 2015-11-02 12:24:00 2015 11 Monday 12 Part Two ALBANY ST 42.334288 -71.072395 (42.33428841, -71.07239518) DOWNTOWN
3 I192008877 1107 Fraud FRAUD - IMPERSONATION E18 525 NaN 2015-07-31 10:00:00 2015 7 Friday 10 Part Two WINGATE RD 42.237009 -71.129566 (42.23700950, -71.12956606) HYDE PARK
4 I182090828 1102 Fraud FRAUD - FALSE PRETENSE / SCHEME D4 159 NaN 2015-12-01 12:00:00 2015 12 Tuesday 12 Part Two UPTON ST 42.342432 -71.072258 (42.34243222, -71.07225766) SOUTH END
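
The nested loop above works because every remaining district code is present in the dictionary; a code missing from it would be skipped silently and the list would no longer match the dataframe's length. As a possible alternative, `Series.map` does the same lookup in one step and keeps the rows aligned. The lookup and the frame below are small hypothetical excerpts.

```python
import pandas as pd

# Hypothetical excerpt of the code-to-name lookup and of the DISTRICT column
lookup = {"A1": "DOWNTOWN", "D4": "SOUTH END"}
df = pd.DataFrame({"DISTRICT": ["D4", "A1", "ZZ"]})

# Series.map looks each code up in the dict; unknown codes become NaN
df["District_name"] = df["DISTRICT"].map(lookup)
print(df)
```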
In [12]:
msno.matrix(data)
plt.show()

Since most of the cells in the SHOOTING column are NaN, this column will be dropped.

Dropping the SHOOTING column

In [13]:
data = data.drop(columns='SHOOTING')
data.head()
Out[13]:
INCIDENT_NUMBER OFFENSE_CODE OFFENSE_CODE_GROUP OFFENSE_DESCRIPTION DISTRICT REPORTING_AREA OCCURRED_ON_DATE YEAR MONTH DAY_OF_WEEK HOUR UCR_PART STREET Lat Long Location District_name
0 I192068249 2647 Other THREATS TO DO BODILY HARM B2 280 2015-08-28 10:20:00 2015 8 Friday 10 Part Two WASHINGTON ST 42.330119 -71.084251 (42.33011862, -71.08425106) BOXBURY
1 I192061894 1106 Confidence Games FRAUD - CREDIT CARD / ATM FRAUD C11 356 2015-08-20 00:00:00 2015 8 Thursday 0 Part Two CHARLES ST 42.300605 -71.061268 (42.30060543, -71.06126785) DORCHESTER
2 I192038828 1107 Fraud FRAUD - IMPERSONATION A1 172 2015-11-02 12:24:00 2015 11 Monday 12 Part Two ALBANY ST 42.334288 -71.072395 (42.33428841, -71.07239518) DOWNTOWN
3 I192008877 1107 Fraud FRAUD - IMPERSONATION E18 525 2015-07-31 10:00:00 2015 7 Friday 10 Part Two WINGATE RD 42.237009 -71.129566 (42.23700950, -71.12956606) HYDE PARK
4 I182090828 1102 Fraud FRAUD - FALSE PRETENSE / SCHEME D4 159 2015-12-01 12:00:00 2015 12 Tuesday 12 Part Two UPTON ST 42.342432 -71.072258 (42.34243222, -71.07225766) SOUTH END

Dropping rows without coordinates (Lat/Long)

In [14]:
data = data.dropna(subset=['Lat','Long'])
msno.matrix(data)
plt.show()

Deriving more usable date columns from OCCURRED_ON_DATE

In [15]:
data['OCCURRED_ON_DATE'] = pd.to_datetime(data['OCCURRED_ON_DATE'])
data["DAY_OF_WEEK"] = pd.Categorical(data["DAY_OF_WEEK"], 
              categories=['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday'],
              ordered=True)
def create_features(df):
    df['dayofweek'] = df['OCCURRED_ON_DATE'].dt.dayofweek
    df['quarter'] = df['OCCURRED_ON_DATE'].dt.quarter
    df['dayofyear'] = df['OCCURRED_ON_DATE'].dt.dayofyear
    df['dayofmonth'] = df['OCCURRED_ON_DATE'].dt.day
    # ISO week number (Series.dt.weekofyear is deprecated)
    df['weekofyear'] = df['OCCURRED_ON_DATE'].dt.isocalendar().week
    
    X = df[['dayofweek','quarter','dayofyear',
            'dayofmonth','weekofyear']]
    return X
create_features(data).head()

# CategoricalDtype
data.quarter    = data.quarter.astype(CategoricalDtype())
data.dayofweek    = data.dayofweek.astype(CategoricalDtype())
data.dayofyear    = data.dayofyear.astype(CategoricalDtype())
data.dayofmonth    = data.dayofmonth.astype(CategoricalDtype())

data.head()
Out[15]:
INCIDENT_NUMBER OFFENSE_CODE OFFENSE_CODE_GROUP OFFENSE_DESCRIPTION DISTRICT REPORTING_AREA OCCURRED_ON_DATE YEAR MONTH DAY_OF_WEEK ... STREET Lat Long Location District_name dayofweek quarter dayofyear dayofmonth weekofyear
0 I192068249 2647 Other THREATS TO DO BODILY HARM B2 280 2015-08-28 10:20:00 2015 8 Friday ... WASHINGTON ST 42.330119 -71.084251 (42.33011862, -71.08425106) BOXBURY 4 3 240 28 35
1 I192061894 1106 Confidence Games FRAUD - CREDIT CARD / ATM FRAUD C11 356 2015-08-20 00:00:00 2015 8 Thursday ... CHARLES ST 42.300605 -71.061268 (42.30060543, -71.06126785) DORCHESTER 3 3 232 20 34
2 I192038828 1107 Fraud FRAUD - IMPERSONATION A1 172 2015-11-02 12:24:00 2015 11 Monday ... ALBANY ST 42.334288 -71.072395 (42.33428841, -71.07239518) DOWNTOWN 0 4 306 2 45
3 I192008877 1107 Fraud FRAUD - IMPERSONATION E18 525 2015-07-31 10:00:00 2015 7 Friday ... WINGATE RD 42.237009 -71.129566 (42.23700950, -71.12956606) HYDE PARK 4 3 212 31 31
4 I182090828 1102 Fraud FRAUD - FALSE PRETENSE / SCHEME D4 159 2015-12-01 12:00:00 2015 12 Tuesday ... UPTON ST 42.342432 -71.072258 (42.34243222, -71.07225766) SOUTH END 1 4 335 1 49

5 rows × 22 columns

Rename and reorder the columns

In [16]:
rename = {'OFFENSE_CODE_GROUP':'Group',
          'OFFENSE_DESCRIPTION':'Description',
          'DISTRICT':'District',
          'STREET':'Street',        
          'OCCURRED_ON_DATE':'Date',
          'YEAR':'Year',
          'MONTH':'Month',
          'DAY_OF_WEEK':'Day',
          'HOUR':'Hour'}

data.rename(index=str, columns=rename, inplace=True)
In [17]:
data = data[['INCIDENT_NUMBER', 'OFFENSE_CODE', 'Group', 'Description', 
        'Date', 'Year', 'Month', 'Day', 'Hour','dayofweek',
       'quarter', 'dayofyear', 'dayofmonth', 'weekofyear','District', 'District_name' ,'REPORTING_AREA', 'UCR_PART',
       'Street', 'Lat', 'Long', 'Location']]

data.head()
Out[17]:
INCIDENT_NUMBER OFFENSE_CODE Group Description Date Year Month Day Hour dayofweek ... dayofmonth weekofyear District District_name REPORTING_AREA UCR_PART Street Lat Long Location
0 I192068249 2647 Other THREATS TO DO BODILY HARM 2015-08-28 10:20:00 2015 8 Friday 10 4 ... 28 35 B2 BOXBURY 280 Part Two WASHINGTON ST 42.330119 -71.084251 (42.33011862, -71.08425106)
1 I192061894 1106 Confidence Games FRAUD - CREDIT CARD / ATM FRAUD 2015-08-20 00:00:00 2015 8 Thursday 0 3 ... 20 34 C11 DORCHESTER 356 Part Two CHARLES ST 42.300605 -71.061268 (42.30060543, -71.06126785)
2 I192038828 1107 Fraud FRAUD - IMPERSONATION 2015-11-02 12:24:00 2015 11 Monday 12 0 ... 2 45 A1 DOWNTOWN 172 Part Two ALBANY ST 42.334288 -71.072395 (42.33428841, -71.07239518)
3 I192008877 1107 Fraud FRAUD - IMPERSONATION 2015-07-31 10:00:00 2015 7 Friday 10 4 ... 31 31 E18 HYDE PARK 525 Part Two WINGATE RD 42.237009 -71.129566 (42.23700950, -71.12956606)
4 I182090828 1102 Fraud FRAUD - FALSE PRETENSE / SCHEME 2015-12-01 12:00:00 2015 12 Tuesday 12 1 ... 1 49 D4 SOUTH END 159 Part Two UPTON ST 42.342432 -71.072258 (42.34243222, -71.07225766)

5 rows × 22 columns

In [18]:
data.describe().T
Out[18]:
count mean std min 25% 50% 75% max
OFFENSE_CODE 329214.0 2295.289110 1182.776062 111.000000 802.000000 2907.000000 3201.000000 3831.000000
Year 329214.0 2016.686757 1.041985 2015.000000 2016.000000 2017.000000 2018.000000 2018.000000
Month 329214.0 6.960381 3.331718 1.000000 4.000000 7.000000 10.000000 12.000000
Hour 329214.0 13.104968 6.284181 0.000000 9.000000 14.000000 18.000000 23.000000
weekofyear 329214.0 28.544658 14.576327 1.000000 17.000000 30.000000 41.000000 53.000000
Lat 329214.0 42.296993 1.046401 -1.000000 42.297555 42.325574 42.348624 42.395042
Long 329214.0 -71.042017 1.692250 -71.178674 -71.097223 -71.077565 -71.062563 -1.000000
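
The summary shows a minimum Lat of -1 and a maximum Long of -1, which are not valid Boston coordinates and look like placeholder values. Assuming they are placeholders (the dataset description is not part of this notebook), a minimal sketch for excluding them before mapping could look like this; the rows are made up for illustration.

```python
import pandas as pd

# Hypothetical rows: the second one carries the -1 placeholder pair
pts = pd.DataFrame({"Lat": [42.330119, -1.0, 42.300605],
                    "Long": [-71.084251, -1.0, -71.061268]})

# Keep rows where neither coordinate equals the assumed placeholder
valid = pts[(pts["Lat"] != -1) & (pts["Long"] != -1)]
print(len(pts), "->", len(valid))
```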

Visualizing crime count by district and UCR category

In [19]:
dis = data.groupby(by=["District","UCR_PART"]).size()
s = dis.to_frame()
s = s.reset_index()
s.columns = ["District","UCR_PART","Crime Counts"]
ax = sns.barplot(x ="District", y = 'Crime Counts', data = s, hue = "UCR_PART" )
plt.legend(title = 'UCR_PART', bbox_to_anchor = (1, 1))
Out[19]:
<matplotlib.legend.Legend at 0x18930ee1188>

Crime count grouped by district and crime type

In [20]:
rcParams["figure.figsize"] = 18,7
sns.set(font_scale=1.75)
order = data['Group'].value_counts().head(5).index
sns.countplot(data = data, x='Group',hue='District', order = order,  saturation=2,linewidth=1)
Out[20]:
<matplotlib.axes._subplots.AxesSubplot at 0x18935440248>

Analyzing crime by weekday and hour

In [21]:
sns.heatmap(pd.pivot_table(data = data, index = "dayofweek", 
                              columns = "Hour", values = "INCIDENT_NUMBER", aggfunc = 'count'), 
                              cmap = 'Reds')
Out[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x18935446548>

Analyzing crime by month

In [22]:
grouped = data.groupby(['Month','District']).count()
sns.boxplot(x ="Month", y = "Group", data = grouped.reset_index(), palette="ch:.102");
In [23]:
district = data['District_name'].unique()
latitude=[]
longitude=[]
coor = []
for i in range (len(district)):
#    response = requests.get('https://maps.googleapis.com/maps/api/geocode/json?address={}+BOSTON+MA&key=YOUR_GOOGLE_API_KEY'.format(district[i]))
#    resp_json_payload = response.json()
#    coor.append([resp_json_payload['results'][0]['geometry']['location']['lat'],resp_json_payload['results'][0]['geometry']['location']['lng']])
#    latitude.append(resp_json_payload['results'][0]['geometry']['location']['lat']) 
#    longitude.append(resp_json_payload['results'][0]['geometry']['location']['lng']) 
    g = geocoder.arcgis('{} ,Boston'.format(district[i]))
    lat_lng_coords = g.latlng
    coor.append(lat_lng_coords)
    latitude.append(lat_lng_coords[0]) 
    longitude.append(lat_lng_coords[1]) 

district_coor = pd.DataFrame()
district_coor['district'] = district
district_coor['coor'] = coor
district_coor['latitude'] = latitude
district_coor['longitude'] = longitude
In [24]:
district_coor.head(15)
Out[24]:
district coor latitude longitude
0 BOXBURY [42.330303515648225, -71.08946869163574] 42.330304 -71.089469
1 DORCHESTER [42.351354908126154, -71.05284849998098] 42.351355 -71.052848
2 DOWNTOWN [42.35829000000007, -71.05662999999998] 42.358290 -71.056630
3 HYDE PARK [42.27477303496225, -71.11989847471231] 42.274773 -71.119898
4 SOUTH END [42.34256000000005, -71.07357999999994] 42.342560 -71.073580
5 MATTAPAN [42.278222288859574, -71.0960831569464] 42.278222 -71.096083
6 BRIGHTON [42.35213365368456, -71.12492527560583] 42.352134 -71.124925
7 EAST BOSTON [42.35141817326235, -71.05671435784329] 42.351418 -71.056714
8 WEST BOXBURY [42.35473968331843, -71.06251016215178] 42.354740 -71.062510
9 SOUTH BOSTON [42.3522498538783, -71.05568998397878] 42.352250 -71.055690
10 JAMAICA PLAIN [42.30584890846422, -71.11909201668144] 42.305849 -71.119092
11 CHARLESTOWN [42.3677501180056, -71.05905551335397] 42.367750 -71.059056
In [25]:
heatmap = folium.Map(location=coor[0], zoom_start=12)
"""
for index, row in district_coor.iterrows():
    folium.CircleMarker(
            row['coor'],
            radius=5,
            color='red',
            fill=True,
            popup = row['district'],
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(heatmap) 
"""
heatmap
Out[25]:
[Interactive folium map; not rendered in this export]

Some of the district points are not accurate enough, so I will use Google Maps to verify and correct them.

In [26]:
district_coor.at[1,'coor'] = [42.293066, -71.071760]
district_coor.at[1,'latitude'] = 42.293066
district_coor.at[1,'longitude'] = -71.071760

district_coor.at[7,'coor'] = [42.370918, -71.039203]
district_coor.at[7,'latitude'] = 42.370918
district_coor.at[7,'longitude'] = -71.039203

district_coor.at[8,'coor'] = [42.280873, -71.162792]
district_coor.at[8,'latitude'] = 42.280873
district_coor.at[8,'longitude'] = -71.162792

district_coor.at[9,'coor'] = [42.337805, -71.049307]
district_coor.at[9,'latitude'] = 42.337805
district_coor.at[9,'longitude'] = -71.049307

district_coor.at[11,'coor'] = [42.378547, -71.061281]
district_coor.at[11,'latitude'] = 42.378547
district_coor.at[11,'longitude'] = -71.061281

district_coor
Out[26]:
district coor latitude longitude
0 BOXBURY [42.330303515648225, -71.08946869163574] 42.330304 -71.089469
1 DORCHESTER [42.293066, -71.07176] 42.293066 -71.071760
2 DOWNTOWN [42.35829000000007, -71.05662999999998] 42.358290 -71.056630
3 HYDE PARK [42.27477303496225, -71.11989847471231] 42.274773 -71.119898
4 SOUTH END [42.34256000000005, -71.07357999999994] 42.342560 -71.073580
5 MATTAPAN [42.278222288859574, -71.0960831569464] 42.278222 -71.096083
6 BRIGHTON [42.35213365368456, -71.12492527560583] 42.352134 -71.124925
7 EAST BOSTON [42.370918, -71.039203] 42.370918 -71.039203
8 WEST BOXBURY [42.280873, -71.162792] 42.280873 -71.162792
9 SOUTH BOSTON [42.337805, -71.049307] 42.337805 -71.049307
10 JAMAICA PLAIN [42.30584890846422, -71.11909201668144] 42.305849 -71.119092
11 CHARLESTOWN [42.378547, -71.061281] 42.378547 -71.061281

Adding the Boston border to the map

In [27]:
gpf = gpd.read_file("./data/Zoning_Districts.geojson")
gpf
style = {'fillColor': '#00000000', 'color': '#000000','weight':2 , 'opacity':0.5}
folium.GeoJson(data=gpf['geometry'],style_function=lambda x: style).add_to(heatmap)
heatmap
Out[27]:
[Interactive folium map; not rendered in this export]

Adding district markers to the map

In [28]:
for index, row in district_coor.iterrows():
    folium.CircleMarker(
            row['coor'],
            radius=5,
            color='red',
            fill=True,
            popup = row['district'],
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(heatmap) 
heatmap
Out[28]:
[Interactive folium map; not rendered in this export]

Adding a heatmap of crime to the map

In [29]:
from folium import plugins
heat = data[['Lat', 'Long']].values
heatmap.add_child(plugins.HeatMap(heat,radius=13))
heatmap
Out[29]:
[Interactive folium map; not rendered in this export]

Calling Foursquare API to get venues

In [30]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN' # your Foursquare Access Token
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)
Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET:YOUR_CLIENT_SECRET
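
API credentials are generally safer outside the notebook itself. A minimal sketch, assuming the keys are stored in environment variables: the variable names and the helpers `load_credentials` and `mask` are hypothetical, not part of the Foursquare API.

```python
import os

def load_credentials(env=os.environ):
    """Read Foursquare credentials from environment variables.

    The variable names are assumptions made for this sketch.
    """
    names = ("FOURSQUARE_CLIENT_ID", "FOURSQUARE_CLIENT_SECRET")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]

def mask(secret, visible=4):
    """Show only the first few characters of a secret when printing."""
    return secret[:visible] + "*" * (len(secret) - visible)

# Demo with a plain dict standing in for os.environ
demo = {"FOURSQUARE_CLIENT_ID": "abcd1234", "FOURSQUARE_CLIENT_SECRET": "wxyz5678"}
client_id, client_secret = load_credentials(demo)
print("CLIENT_ID:", mask(client_id))
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_credentials()` would then read from the real environment.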

Writing a function to get each district's venues and store the result in a dataframe

In [31]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
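
The list comprehension above assumes every returned venue has at least one category, and the column assignment assumes at least one venue came back. The sketch below is a more tolerant way to flatten the same `items` structure; `venues_to_frame` is a hypothetical helper and the sample items are made up to mirror the fields the function reads.

```python
import pandas as pd

COLUMNS = ["Neighborhood", "Neighborhood Latitude", "Neighborhood Longitude",
           "Venue", "Venue Latitude", "Venue Longitude", "Venue Category"]

def venues_to_frame(name, lat, lng, items):
    """Flatten Foursquare-style items into rows, tolerating missing categories."""
    rows = []
    for v in items:
        venue = v["venue"]
        cats = venue.get("categories") or []
        category = cats[0]["name"] if cats else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    # Passing columns= here also works when rows is empty
    return pd.DataFrame(rows, columns=COLUMNS)

# Made-up items shaped like the response fields used above
sample = [
    {"venue": {"name": "Venue A", "location": {"lat": 42.0, "lng": -71.0},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 42.1, "lng": -71.1},
               "categories": []}},
]
frame = venues_to_frame("DOWNTOWN", 42.35829, -71.05663, sample)
print(frame[["Venue", "Venue Category"]])
```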

Calling the previous function

In [32]:
boston_venues = getNearbyVenues(names=district_coor['district'],
                                   latitudes=district_coor['latitude'],
                                   longitudes=district_coor['longitude'])
BOXBURY
DORCHESTER
DOWNTOWN
HYDE PARK
SOUTH END
MATTAPAN
BRIGHTON
EAST BOSTON
WEST BOXBURY
SOUTH BOSTON
JAMAICA PLAIN
CHARLESTOWN

Check the dataframe's shape and the venue count per neighborhood

In [33]:
print(boston_venues.shape)
boston_venues.head()
(405, 7)
Out[33]:
Neighborhood Neighborhood Latitude Neighborhood Longitude Venue Venue Latitude Venue Longitude Venue Category
0 BOXBURY 42.330304 -71.089469 Dudley Café 42.329866 -71.083620 Café
1 BOXBURY 42.330304 -71.089469 Southwest Corridor Park 42.331830 -71.094111 Playground
2 BOXBURY 42.330304 -71.089469 Madison Field 42.332222 -71.086956 Soccer Field
3 BOXBURY 42.330304 -71.089469 Joe's Famous Steak & Cheese 42.328800 -71.083908 American Restaurant
4 BOXBURY 42.330304 -71.089469 Reggie Lewis Track & Athletic Center 42.332033 -71.093001 Track
In [34]:
boston_venues.groupby('Neighborhood').count()
Out[34]:
Neighborhood Latitude Neighborhood Longitude Venue Venue Latitude Venue Longitude Venue Category
Neighborhood
BOXBURY 12 12 12 12 12 12
BRIGHTON 29 29 29 29 29 29
CHARLESTOWN 40 40 40 40 40 40
DORCHESTER 10 10 10 10 10 10
DOWNTOWN 100 100 100 100 100 100
EAST BOSTON 44 44 44 44 44 44
HYDE PARK 16 16 16 16 16 16
JAMAICA PLAIN 21 21 21 21 21 21
MATTAPAN 6 6 6 6 6 6
SOUTH BOSTON 38 38 38 38 38 38
SOUTH END 76 76 76 76 76 76
WEST BOXBURY 13 13 13 13 13 13

Applying one-hot encoding

In [35]:
# one hot encoding
boston_onehot = pd.get_dummies(boston_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
boston_onehot['Neighborhood'] = boston_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [boston_onehot.columns[-1]] + list(boston_onehot.columns[:-1])
boston_onehot = boston_onehot[fixed_columns]

boston_onehot.head()
Out[35]:
Neighborhood Accessories Store African Restaurant American Restaurant Arepa Restaurant Art Gallery Asian Restaurant Athletics & Sports Automotive Shop Bagel Shop ... Trail Train Station Used Bookstore Vegetarian / Vegan Restaurant Video Store Vietnamese Restaurant Wine Bar Wine Shop Wings Joint Yoga Studio
0 BOXBURY 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1 BOXBURY 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2 BOXBURY 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
3 BOXBURY 0 0 1 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
4 BOXBURY 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0

5 rows × 137 columns

In [36]:
boston_onehot.shape
Out[36]:
(405, 137)
In [37]:
boston_grouped = boston_onehot.groupby('Neighborhood').mean().reset_index()
boston_grouped
Out[37]:
Neighborhood Accessories Store African Restaurant American Restaurant Arepa Restaurant Art Gallery Asian Restaurant Athletics & Sports Automotive Shop Bagel Shop ... Trail Train Station Used Bookstore Vegetarian / Vegan Restaurant Video Store Vietnamese Restaurant Wine Bar Wine Shop Wings Joint Yoga Studio
0 BOXBURY 0.000000 0.083333 0.083333 0.000000 0.000000 0.000000 0.000000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000
1 BRIGHTON 0.000000 0.000000 0.000000 0.000000 0.000000 0.034483 0.000000 0.034483 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.034483 0.000000 0.000000 0.0000 0.068966
2 CHARLESTOWN 0.000000 0.000000 0.025000 0.000000 0.000000 0.000000 0.025000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.025000
3 DORCHESTER 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000
4 DOWNTOWN 0.000000 0.000000 0.030000 0.000000 0.000000 0.020000 0.000000 0.000000 0.01 ... 0.000000 0.000000 0.01 0.01 0.0000 0.000000 0.000000 0.020000 0.0000 0.000000
5 EAST BOSTON 0.000000 0.000000 0.022727 0.000000 0.045455 0.000000 0.022727 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000
6 HYDE PARK 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0625 0.000000 0.000000 0.000000 0.0625 0.000000
7 JAMAICA PLAIN 0.000000 0.000000 0.000000 0.000000 0.047619 0.000000 0.000000 0.000000 0.00 ... 0.047619 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000
8 MATTAPAN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000
9 SOUTH BOSTON 0.000000 0.000000 0.026316 0.000000 0.000000 0.000000 0.000000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000
10 SOUTH END 0.013158 0.000000 0.039474 0.013158 0.013158 0.013158 0.000000 0.000000 0.00 ... 0.000000 0.000000 0.00 0.00 0.0000 0.000000 0.052632 0.039474 0.0000 0.000000
11 WEST BOXBURY 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.076923 0.00 ... 0.000000 0.076923 0.00 0.00 0.0000 0.000000 0.000000 0.000000 0.0000 0.000000

12 rows × 137 columns
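
To make the grouped table easier to read: averaging one-hot columns per neighborhood gives the share of that neighborhood's venues that fall in each category, so each row sums to 1. A small worked example on made-up data (neighborhoods "X" and "Y" are placeholders):

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighborhood": ["X", "X", "X", "X", "Y", "Y"],
    "Venue Category": ["Cafe", "Cafe", "Park", "Diner", "Park", "Park"],
})

# One indicator column per category (integer 0/1 values)
onehot = pd.get_dummies(venues["Venue Category"], dtype=int)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# Mean of the indicators = fraction of venues in each category
freq = onehot.groupby("Neighborhood").mean()
print(freq)
```

Here two of the four venues in "X" are cafes, so its `Cafe` frequency is 0.5, while "Y" contains only parks.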

In [38]:
num_top_venues = 5

for hood in boston_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = boston_grouped[boston_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')
----BOXBURY----
                 venue  freq
0                 Café  0.25
1             Pharmacy  0.08
2  American Restaurant  0.08
3   African Restaurant  0.08
4                Diner  0.08


----BRIGHTON----
                  venue  freq
0           Yoga Studio  0.07
1           Coffee Shop  0.07
2           Pizza Place  0.07
3  Gym / Fitness Center  0.03
4                   Gym  0.03


----CHARLESTOWN----
               venue  freq
0  Convenience Store  0.08
1        Pizza Place  0.08
2        Coffee Shop  0.08
3          Pet Store  0.05
4         Donut Shop  0.05


----DORCHESTER----
                  venue  freq
0   Fried Chicken Joint   0.1
1              Pharmacy   0.1
2  Fast Food Restaurant   0.1
3                  Park   0.1
4                Market   0.1


----DOWNTOWN----
                venue  freq
0  Seafood Restaurant  0.08
1       Historic Site  0.07
2         Coffee Shop  0.07
3                Park  0.05
4               Hotel  0.04


----EAST BOSTON----
                venue  freq
0                Park  0.07
1  Italian Restaurant  0.07
2         Pizza Place  0.07
3      Sandwich Place  0.05
4   Convenience Store  0.05


----HYDE PARK----
               venue  freq
0       Liquor Store  0.12
1       Home Service  0.06
2                Gym  0.06
3  Convenience Store  0.06
4        Pizza Place  0.06


----JAMAICA PLAIN----
               venue  freq
0               Park  0.14
1          Pet Store  0.10
2             Bakery  0.10
3      Deli / Bodega  0.05
4  Convenience Store  0.05


----MATTAPAN----
                        venue  freq
0              Ice Cream Shop  0.17
1               Event Service  0.17
2  Construction & Landscaping  0.17
3            Business Service  0.17
4                        Park  0.17


----SOUTH BOSTON----
          venue  freq
0   Pizza Place  0.11
1  Liquor Store  0.08
2          Bank  0.05
3    Sports Bar  0.05
4   Coffee Shop  0.05


----SOUTH END----
                venue  freq
0            Wine Bar  0.05
1           Wine Shop  0.04
2           Gift Shop  0.04
3  Mexican Restaurant  0.04
4                Park  0.04


----WEST BOXBURY----
                venue  freq
0       Grocery Store  0.23
1       Train Station  0.08
2        Liquor Store  0.08
3                 Pub  0.08
4  Salon / Barbershop  0.08


Write a function to sort the venues in descending order

In [39]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Create the new dataframe and display the top 10 venues for each neighborhood

In [40]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the 3rd fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = boston_grouped['Neighborhood']

for ind in np.arange(boston_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(boston_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()
Out[40]:
Neighborhood 1st Most Common Venue 2nd Most Common Venue 3rd Most Common Venue 4th Most Common Venue 5th Most Common Venue 6th Most Common Venue 7th Most Common Venue 8th Most Common Venue 9th Most Common Venue 10th Most Common Venue
0 BOXBURY Café Pharmacy American Restaurant African Restaurant Diner Playground Donut Shop Pizza Place Soccer Field Track
1 BRIGHTON Yoga Studio Coffee Shop Pizza Place Gym / Fitness Center Gym Music Venue Nightclub Food Court Food & Drink Shop Thai Restaurant
2 CHARLESTOWN Convenience Store Pizza Place Coffee Shop Pet Store Donut Shop Gastropub Yoga Studio Deli / Bodega Discount Store Restaurant
3 DORCHESTER Fried Chicken Joint Pharmacy Fast Food Restaurant Park Market Electronics Store Grocery Store Bank Sandwich Place Construction & Landscaping
4 DOWNTOWN Seafood Restaurant Historic Site Coffee Shop Park Hotel Sandwich Place American Restaurant Pub Salad Place Cocktail Bar
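
The column labels above rely on a three-element suffix list with a fallback to "th", which is enough for ten columns. If more columns were ever needed, a general helper could build the labels instead; `ordinal` below is a hypothetical name used only for this sketch.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 10th to 20th, 110th to 120th, ...
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)
]
print(columns[1], "|", columns[-1])
```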
In [41]:
from sklearn.cluster import KMeans
# set number of clusters

kclusters = 5

boston_grouped_clustering = boston_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(boston_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_n_alphas=1000, n_jobs=None, eps=np.finfo(np.float).eps,
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\linear_model\least_angle.py:1602: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_n_alphas=1000, n_jobs=None, eps=np.finfo(np.float).eps,
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\linear_model\least_angle.py:1738: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  eps=np.finfo(np.float).eps, copy_X=True, positive=False):
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\decomposition\online_lda.py:29: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  EPS = np.finfo(np.float).eps
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\ipykernel_launcher.py:6: FutureWarning: In a future version of pandas all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only
  
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\metrics\pairwise.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype = np.float
C:\Users\Chun Ho Tse\Anaconda3\lib\site-packages\sklearn\cluster\k_means_.py:445: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  max_iter=max_iter, verbose=verbose)
Out[41]:
array([2, 0, 0, 3, 0, 0, 0, 0, 1, 0])
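The label array above is easier to read as a count of districts per cluster. A minimal sketch, assuming the labels are copied by hand from the Out[41] output (in the notebook itself the same tally could be taken from `kmeans.labels_.tolist()`):

```python
from collections import Counter

# Labels copied from the Out[41] array above (first 10 districts).
labels = [2, 0, 0, 3, 0, 0, 0, 0, 1, 0]

# Count how many districts fall into each cluster.
cluster_sizes = Counter(labels)

for cluster, size in sorted(cluster_sizes.items()):
    print(f"cluster {cluster}: {size} district(s)")
```

A heavily unbalanced split like this one suggests that most districts have similar venue profiles, so it may be worth trying other values of k before reading much into the single-district clusters.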
In [42]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

boston_merged = district_coor

# join the sorted venue table (with its cluster labels) onto district_coor,
# matching district_coor['district'] against the 'Neighborhood' column
boston_merged = boston_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='district')

boston_merged.head(20) # check the last columns!
Out[42]:
   | district | coor | latitude | longitude | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue
0  | BOXBURY | [42.330303515648225, -71.08946869163574] | 42.330304 | -71.089469 | 2 | Café | Pharmacy | American Restaurant | African Restaurant | Diner | Playground | Donut Shop | Pizza Place | Soccer Field | Track
1  | DORCHESTER | [42.293066, -71.07176] | 42.293066 | -71.071760 | 3 | Fried Chicken Joint | Pharmacy | Fast Food Restaurant | Park | Market | Electronics Store | Grocery Store | Bank | Sandwich Place | Construction & Landscaping
2  | DOWNTOWN | [42.35829000000007, -71.05662999999998] | 42.358290 | -71.056630 | 0 | Seafood Restaurant | Historic Site | Coffee Shop | Park | Hotel | Sandwich Place | American Restaurant | Pub | Salad Place | Cocktail Bar
3  | HYDE PARK | [42.27477303496225, -71.11989847471231] | 42.274773 | -71.119898 | 0 | Liquor Store | Home Service | Gym | Convenience Store | Pizza Place | Pharmacy | Supermarket | Mexican Restaurant | Buffet | Gastropub
4  | SOUTH END | [42.34256000000005, -71.07357999999994] | 42.342560 | -71.073580 | 0 | Wine Bar | Wine Shop | Gift Shop | Mexican Restaurant | Park | American Restaurant | Theater | Italian Restaurant | Pet Store | Grocery Store
5  | MATTAPAN | [42.278222288859574, -71.0960831569464] | 42.278222 | -71.096083 | 1 | Ice Cream Shop | Event Service | Construction & Landscaping | Business Service | Park | Pizza Place | New American Restaurant | Nightclub | Performing Arts Venue | Pharmacy
6  | BRIGHTON | [42.35213365368456, -71.12492527560583] | 42.352134 | -71.124925 | 0 | Yoga Studio | Coffee Shop | Pizza Place | Gym / Fitness Center | Gym | Music Venue | Nightclub | Food Court | Food & Drink Shop | Thai Restaurant
7  | EAST BOSTON | [42.370918, -71.039203] | 42.370918 | -71.039203 | 0 | Park | Italian Restaurant | Pizza Place | Sandwich Place | Convenience Store | Café | Mexican Restaurant | Latin American Restaurant | Restaurant | Art Gallery
8  | WEST BOXBURY | [42.280873, -71.162792] | 42.280873 | -71.162792 | 4 | Grocery Store | Train Station | Liquor Store | Pub | Salon / Barbershop | Gym / Fitness Center | Automotive Shop | Pizza Place | Bank | Pharmacy
9  | SOUTH BOSTON | [42.337805, -71.049307] | 42.337805 | -71.049307 | 0 | Pizza Place | Liquor Store | Bank | Sports Bar | Coffee Shop | Bar | Dessert Shop | Bus Station | Pharmacy | Italian Restaurant
10 | JAMAICA PLAIN | [42.30584890846422, -71.11909201668144] | 42.305849 | -71.119092 | 0 | Park | Pet Store | Bakery | Deli / Bodega | Convenience Store | Garden | Monument / Landmark | Comedy Club | Bookstore | Grocery Store
11 | CHARLESTOWN | [42.378547, -71.061281] | 42.378547 | -71.061281 | 0 | Convenience Store | Pizza Place | Coffee Shop | Pet Store | Donut Shop | Gastropub | Yoga Studio | Deli / Bodega | Discount Store | Restaurant
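The merged table above comes from a left join, so any district without a matching `Neighborhood` row would get `NaN` in `Cluster Labels`, and the column would silently become float. A minimal sketch of that behaviour and one way to check for it, using toy frames that borrow a few values from the table (leaving BRIGHTON out of the venue frame is purely illustrative, as are the names `venues_sorted`, `unmatched` and `labelled`):

```python
import pandas as pd

# Toy stand-ins for the notebook's frames; BRIGHTON is deliberately
# missing from the venue table to show what an unmatched district looks like.
district_coor = pd.DataFrame({
    "district": ["DOWNTOWN", "MATTAPAN", "BRIGHTON"],
    "latitude": [42.358290, 42.278222, 42.352134],
    "longitude": [-71.056630, -71.096083, -71.124925],
})
venues_sorted = pd.DataFrame({
    "Neighborhood": ["DOWNTOWN", "MATTAPAN"],
    "Cluster Labels": [0, 1],
    "1st Most Common Venue": ["Seafood Restaurant", "Ice Cream Shop"],
})

# DataFrame.join defaults to a left join: every district row is kept.
merged = district_coor.join(venues_sorted.set_index("Neighborhood"), on="district")

# Districts that found no venue row carry NaN in the label column.
unmatched = merged[merged["Cluster Labels"].isna()]["district"].tolist()
print("districts without a cluster label:", unmatched)

# Keep only labelled districts and restore integer labels for later indexing.
labelled = merged.dropna(subset=["Cluster Labels"]).copy()
labelled["Cluster Labels"] = labelled["Cluster Labels"].astype(int)
print(labelled[["district", "Cluster Labels"]])
```

Integer labels matter for the map cell that follows, because the labels are used there to index a list of colors.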

"""
for index, row in data.iterrows():
    folium.CircleMarker([row['Lat'], row['Long']],
                        radius=2,
                        popup=folium.Popup(row['Street']),
                        fill_color="#3db7e4"  # divvy color
                        ).add_to(heatmap)
heatmap
"""

In [43]:
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[latitude[0], longitude[0]], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
# note: the markers are added to the existing `heatmap` object, which is also what is displayed;
# `map_clusters` created above is not used in this cell
for lat, lon, poi, cluster in zip(boston_merged['latitude'], boston_merged['longitude'], boston_merged['district'], boston_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(heatmap)

heatmap
Out[43]:
[Interactive folium map: one circle marker per district, colored by cluster label]
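In the color setup of the cell above, `ys` is only used for its length, and `rainbow[cluster-1]` shifts every label by one, so cluster 0 picks up the last color in the list. A minimal sketch of a more direct mapping, assuming `kclusters = 5` (the value is not shown in this section; labels 0 to 4 appear in the table) and using the hypothetical names `palette` and `color_for`:

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5  # assumed value; use the k that was passed to KMeans

# One evenly spaced rainbow color per cluster, as '#rrggbb' hex strings.
palette = [colors.rgb2hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]

def color_for(cluster):
    """Return the hex color for a cluster label, indexing the palette directly."""
    return palette[int(cluster)]

for cluster in range(kclusters):
    print(cluster, color_for(cluster))
```

Indexing by the label itself keeps the color order aligned with the cluster numbers shown in the popups, and the `int()` cast guards against labels that became floats after a join.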